A CONTEXT-FREE AND A 1-COUNTER GEODESIC 
LANGUAGE FOR A BAUMSLAG-SOLITAR GROUP 



MURRAY ELDER 

Abstract. We give a language of unique geodesic normal forms for the Baumslag- 
Solitar group BS(1, 2) that is context-free and 1-counter. We discuss the classes 
of context-free, 1-counter and counter languages, and explain how they are 
inter-related. 



1. Introduction 

In this article we give a simple combinatorial description of a language of normal 
forms for the solvable Baumslag-Solitar group BS(1, 2) with the standard generating 
set, such that each normal form word is geodesic, each group element has a unique 
normal form representative, and the language is accepted by a (partially blind) 
1-counter automaton. It follows that the language is context-free. 

Several authors have studied geodesic languages for the (solvable) Baumslag-
Solitar groups, including Brazil, Collins, Edjvet and Gill, Freden and McCann,
Groves, Miller, and the author and Hermiller. It is well known that
Baumslag-Solitar groups are asynchronously automatic but not automatic, and
the asynchronous language is not geodesic. Groves proved that no geodesic language
of normal forms for a solvable Baumslag-Solitar group with standard generating set
can be regular, so we could say that context-free or 1-counter is the next best
thing.

Collins, Edjvet and Gill proved that the growth function (the formal power series
where the n-th coefficient is the number of elements having a geodesic representative
of length n) of a solvable Baumslag-Solitar group is rational, and Freden and
McCann have studied growth functions for the non-solvable case.

If $G$ is a group with generating set $\mathcal{G}$, we say two words $u, v$ are equal
in the group, or $u =_G v$, if they represent the same group element. We say $u$ and
$v$ are identical if they are equal in the free monoid, that is, they are equal in
$\mathcal{G}^*$.

Definition 1 (G-automaton). Let $G$ be a group and $\Sigma$ a finite set. A (non-
deterministic) $G$-automaton $A_G$ over $\Sigma$ is a finite directed graph with a
distinguished start vertex $q_0$, some distinguished accept vertices, and with edges
labeled by $(\Sigma^{\pm 1} \cup \{\epsilon\}) \times G$. If $p$ is a path in $A_G$, the
element of $(\Sigma^{\pm 1})^*$ which is the first component of the label of $p$ is
denoted by $w(p)$, and the element of $G$ which is the second component of the label
of $p$ is denoted $g(p)$. If $p$ is the empty path, $g(p)$ is the identity element of
$G$ and $w(p)$ is the empty word. $A_G$ is said to accept a word $w$ if there is a path
$p$ from the start vertex to some accept vertex such that $w(p) = w$ and $g(p) =_G 1$.

Definition 2 (Finite state automaton; Regular). If $G$ is the trivial group, then $A_G$
is a (non-deterministic) finite state automaton. A language is regular if it is the
set of strings accepted by a finite state automaton.

Definition 3 (Counter; 1-counter). A language is $k$-counter if it is accepted by
some $\mathbb{Z}^k$-automaton. We call the generators of $\mathbb{Z}^k$ and their
inverses counters. A language is counter if it is $k$-counter for some $k \ge 1$.

For example, the language $\{a^n b^n a^n \mid n \in \mathbb{N}\}$ is accepted by the
$\mathbb{Z}^2$-automaton in Figure 1 with alphabet $a, b$ and counters $x_1, x_2$.



In the case of $\mathbb{Z}$-automata, we assume that the generator is 1 and the binary
operation is addition, and we may insist without loss of generality that each
transition changes the counter by either 0, 1 or $-1$. We can do this by adding states
and transitions to the automaton appropriately. That is, if some edge changes the
counter by $k \neq 0, \pm 1$ then divide the edge into $|k|$ edges using more states.
The symbols $+$, $-$ indicate a change of 1, $-1$ respectively on a transition.
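
As an illustration of Definition 3, the following Python sketch simulates a
$\mathbb{Z}^k$-automaton whose counters are only inspected once the whole input has
been read. The transition table is an assumption made for this example (three states,
one loop per block of letters) and is not copied from Figure 1; the names `EDGES`,
`START`, `ACCEPT` and `accepts` are hypothetical.

```python
from collections import deque

# Hypothetical Z^2-automaton for {a^n b^n a^n}; not taken from Figure 1.
# Each edge is (state, letter, next_state, counter_change); "" is an epsilon edge.
EDGES = [
    (0, "a", 0, (1, 0)),    # first block of a's: increment counter x1
    (0, "", 1, (0, 0)),
    (1, "b", 1, (-1, 1)),   # block of b's: decrement x1, increment x2
    (1, "", 2, (0, 0)),
    (2, "a", 2, (0, -1)),   # last block of a's: decrement x2
]
START, ACCEPT = 0, {2}


def accepts(word, edges=EDGES, start=START, accept=ACCEPT, k=2):
    """Return True if some path reads `word`, ends in an accept state and
    leaves every counter at zero. Counters are never tested along the way.
    The search terminates here because no epsilon cycle changes a counter."""
    zero = (0,) * k
    first = (start, 0, zero)            # (state, letters read, counter values)
    seen = {first}
    queue = deque([first])
    while queue:
        state, pos, counters = queue.popleft()
        if pos == len(word) and state in accept and counters == zero:
            return True
        for src, letter, dst, change in edges:
            if src != state:
                continue
            if letter == "":
                new_pos = pos
            elif pos < len(word) and word[pos] == letter:
                new_pos = pos + 1
            else:
                continue
            config = (dst, new_pos, tuple(c + d for c, d in zip(counters, change)))
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False
```

For instance, `accepts("aabbaa")` holds while `accepts("aabba")` does not: in the
second case the automaton reaches its accept state with the second counter at 1.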

Definition 4 (Pushdown automaton; Context-free). A pushdown automaton is a
6-tuple $(Q, \Sigma, \Gamma, \tau, q_0, A)$ where $Q$, $\Sigma$, $\Gamma$ and $A$ are
all finite sets, and

(1) $Q$ is the set of states,

(2) $\Sigma$ is the input alphabet together with the empty word $\epsilon$,

(3) $\Gamma$ is the stack alphabet together with $\epsilon$ (the empty symbol),

(4) $\tau$ is the transition function,

(5) $q_0$ is the start state,

(6) $A \subseteq Q$ is the set of accept states.

The transition function takes as input a state and an input letter, and outputs a
state and a stack instruction of the form $\gamma \to \beta$, which means pop $\gamma$
from the top of the stack then push $\beta$ on the top of the stack. Note that
$\epsilon \to \gamma$ means push $\gamma$ onto the stack, $\gamma \to \epsilon$ means
pop $\gamma$ off the stack, and $\epsilon \to \epsilon$ means do nothing (and in this
case will be omitted).

A word is accepted by the automaton if there is a sequence of transitions starting
from the state $q_0$ with an empty stack, pushing and popping stack symbols, to an
accept state. Note that you can always push new symbols onto the stack, but you
can only pop if the correct symbol is on top of the stack.

A language is context-free if it is the language of some pushdown automaton. 




Figure 1. A counter automaton accepting $a^n b^n a^n$.

As an example, the language $\{a^n b^n \mid n \in \mathbb{N}\}$ is accepted by the
pushdown automaton in Figure 2 with alphabet $a, b$ and stack symbols $\$$, $1$, and
this language is not regular [5].



Figure 2. Pushdown automaton accepting $a^n b^n$.

Note that our definition of counter automata is not equivalent to pushdown
automata with a stack (with one type of token) for each counter, since in our
definition, we cannot test the value of the counter until we are done reading the
input. For this reason, these automata are sometimes referred to as "partially
blind" or vision-impaired counter automata, since they cannot "see" whether the
counter is non-zero except at the end.

Definition 5 (Baumslag-Solitar group). The group with presentation
$\langle a, t \mid tat^{-1} = a^p \rangle$ is the solvable Baumslag-Solitar group
$BS(1,p)$, for $p \in \mathbb{Z}$, $p \ge 2$.
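
Equality of words in BS(1, 2) can be tested mechanically. The Python sketch below is
not part of the paper's argument: it assumes the standard faithful representation
sending $a$ to the matrix [[1, 1], [0, 1]] and $t$ to [[2, 0], [0, 1]], and it encodes
$a^{-1}$ and $t^{-1}$ by the letters `A` and `T`. The names `GEN`, `evaluate` and
`equal_in_group` are hypothetical.

```python
from fractions import Fraction

# Assumed encoding: a matrix [[m, u], [0, 1]] is stored as the pair (m, u).
# "A" stands for a^{-1} and "T" for t^{-1}.
GEN = {
    "a": (Fraction(1), Fraction(1)),
    "A": (Fraction(1), Fraction(-1)),
    "t": (Fraction(2), Fraction(0)),
    "T": (Fraction(1, 2), Fraction(0)),
}


def evaluate(word):
    """Multiply out the matrices of the letters of `word`, left to right."""
    m, u = Fraction(1), Fraction(0)
    for letter in word:
        gm, gu = GEN[letter]
        # [[m, u], [0, 1]] * [[gm, gu], [0, 1]] = [[m * gm, m * gu + u], [0, 1]]
        m, u = m * gm, m * gu + u
    return m, u


def equal_in_group(u, v):
    """Two words are equal in BS(1, 2) when their matrices agree
    (this relies on the assumed representation being faithful)."""
    return evaluate(u) == evaluate(v)
```

With this encoding the defining relation reads `equal_in_group("taT", "aa")`, and
`equal_in_group("at", "ta")` fails, reflecting that $a$ and $t$ do not commute.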

In this article we will consider the group BS(1, 2). Let
$\mathcal{G} = \{a, a^{-1}, t, t^{-1}\}$ be the inverse closed generating set for
BS(1, 2). We give a picture of part of the Cayley graph for BS(1, 2) in Figure 3.
From the side the Cayley graph looks like a binary
tree. See [Hj for a detailed description of the Cayley graph. 

Figure 3. Part of the Cayley graph for BS(1, 2).

The paper is organised as follows. In Sections 2 and 3 we examine the various
definitions of formal languages presented above, and establish their relative
intersections and inclusions, which we illustrate in Figure 5. In particular we prove
that 1-counter languages as defined are context-free. In Section 4 we define a normal
form language for BS(1, 2) and prove that each normal form word is geodesic, and
the language of normal form words bijects to the set of group elements. In Section 5
we prove that this normal form language is 1-counter, which implies it is
context-free. Then in the last section we show that the language of all geodesics for
BS(1, 2) is not counter.

2. 1-counter languages

Lemma 1. Every 1-counter language is context-free.

Proof. Let $L$ be a 1-counter language accepted by a 1-counter machine $M$. We will
construct a (non-deterministic) pushdown automaton $N$ that accepts the language
$L$, with stack symbols $\$_+$, $\$_-$ and $1$. Let $M_+$ be a copy of $M$ obtained by
replacing transitions $(a, +)$ by $(a, \epsilon \to 1)$ and $(a, -)$ by
$(a, 1 \to \epsilon)$, and let $M_-$ be a copy of $M$ obtained by replacing transitions
$(a', -)$ by $(a', \epsilon \to 1)$ and $(a', +)$ by $(a', 1 \to \epsilon)$.

$N$ is constructed from these two automata $M_+$ and $M_-$ as follows. The states of
$N$ consist of two distinct states $q_+$, $q_-$ for each state $q$ of $M$, plus a new
start state $s_0$ and a new single accept state $p$. There is a transition labelled
$(\epsilon, \epsilon \to \$_+)$ from $s_0$ to the former start state $(q_0)_+$ in
$M_+$. For each $q_+$ in $M_+$ there is a transition labelled
$(\epsilon, \$_+ \to \$_-)$ from $q_+$ to the corresponding state $q_-$ in $M_-$, and a
transition labelled $(\epsilon, \$_- \to \$_+)$ from $q_-$ to $q_+$ in $M_+$.

Finally for every accept state $q$ in $M$ there is a transition labelled
$(\epsilon, \$_+ \to \epsilon)$ from $q_+$ in $M_+$ to the single accept state $p$, and
$(\epsilon, \$_- \to \epsilon)$ from $q_-$ in $M_-$ to the single accept state $p$.

This new machine works by starting with an empty stack and pushing $\$_+$ on the
bottom. Then if the old machine increments the counter, the new machine adds
1 to the stack. From then on, if the counter value never dips below zero, the new
machine will stay in the $M_+$ states. However, if there is ever a "pop 1" but the
symbol on top of the stack is $\$_+$, the machine passes over to $M_-$. The height of
the stack then represents the negative value of the counter, and the machine stays on
this side until the value of the counter comes back to zero, at which point it can
switch back. Since the bottom marker can only be popped when no 1 lies above it, the
accept state $p$ is reached exactly when the counter is zero.

It follows that the language of $N$ is precisely the language $L$ of the 1-counter
machine $M$. □

Lemma 2. The language of strings of the form $a^n b^n a^m b^m$ is both counter and
context-free but not 1-counter.

Proof. The pushdown automaton and the $\mathbb{Z}^2$-automaton in Figure 4 both accept
this language, so it is context-free and counter.

Suppose by way of contradiction that the language is 1-counter, and let M be a 
1-counter machine for it with p states. Assume without loss of generality that each 
transition changes the counter by either 0,-1 or 1. 

Define $a_1 = a^{p^2}$, $b_1 = b^{p^2}$, $a_2 = a^{p^2}$, $b_2 = b^{p^2}$, and consider
the word $s = a_1 b_1 a_2 b_2$ which belongs to the language.

Consider the prefix $a_1 = a^{p^2}$. Since this prefix is longer than the number of
states, it must visit some state twice, so $a_1 = x_0 y_0 z_0$ where $y_0$ represents a
loop of length at most $p$.

If going around $y_0$ causes a net change of zero in the value of the counter, then
going around it twice would give a new word that is accepted by $M$, but not of the
form $a^n b^n a^m b^m$. So assume the net change is $k_0$ with $|k_0| \ge 1$.

Figure 4. Pushdown automaton and 2-counter machine accepting $a^n b^n a^m b^m$.



Let $s_1 = x_0 z_0$ which has length at least $p^2 - p$, so must go around a loop in
$M$. So $s_1 = x_1 y_1 z_1$ with $y_1$ a loop of length at most $p$. Again, if the net
change in the counter going around $y_1$ is zero then we can go around $y_1$ twice and
have a word accepted by $M$ that is not in the language.

If the net change is $k_1$ of the opposite sign to $k_0$ then there is a word that goes
$|k_1|$ extra times around the loop $y_0$ and $|k_0|$ extra times around $y_1$, which
keeps the final value of the counter at zero, so is accepted by $M$, but since we are
pumping the prefix of $s$ we have a word that is not in the language.

Thus $y_1$ changes the counter by $k_1$ with $|k_1| \ge 1$ and having the same sign as
$k_0$. Let $s_2 = x_1 z_1$ with length at least $p^2 - 2p$.

Iteratively we can write $s_i = x_i y_i z_i$ with $y_i$ a loop which changes the value
of the counter by an amount $k_i$ of the same sign as $k_0$, until there are no loops
left in $x_i z_i$, which does not happen until at least $p$ iterations (since $s_i$ has
length at least $p^2 - ip$).

Since the final $x_i z_i$ has no loops, it has length at most $p - 1$. So it changes the
value of the counter by some $l_1$ with $|l_1| < p$. Whereas, the loops $y_i$ together
change the value of the counter by at least $p$ in absolute value, since each one
contributes at least 1 to the sum.

Now repeat this analysis for the subwords $b_1$, $a_2$ and $b_2$.

If all the loops in each subword change the counter by the same sign, then we
have a contradiction, since the net change of all the loops is at least $4p$ in
absolute value whereas the net change of the four remaining $x_i z_i$ segments is less
than $4p$, so they cannot cancel each other.

Thus at least two subwords have loops of opposite signs. If the loops in $a_1$ have
the same sign as the loops in $a_2$ and $b_2$, then the loops in $b_1$ must have the
opposite sign. So suppose that some loop in $b_1$ changes the counter by $k$, and some
loop in $a_2$ changes the counter by $l$ of the opposite sign to $k$. Then pumping the
first loop by $|l|$ and the second by $|k|$ gives a word that is accepted by $M$ and
not in the language.

Otherwise if the loops in $a_1$ have the opposite sign to the loops in either $a_2$ or
$b_2$, then take a loop in $a_1$ which changes the counter by $k$ and a loop in $a_2$ or
$b_2$ that changes the counter by $l$ of the opposite sign to $k$. Then pumping the
first loop by $|l|$ and the second by $|k|$ gives a word that is accepted by $M$ and
not in the language. □

Corollary 1. 1-counter languages are not closed under concatenation or intersec- 
tion. 

Proof. The language $C = \{a^n b^n \mid n \in \mathbb{N}\}$ is 1-counter but $CC$ is
not 1-counter by the previous lemma (Lemma 2).

The languages $D = \{a^n b^n c^m \mid m, n \in \mathbb{N}\}$ and
$E = \{a^m b^n c^n \mid m, n \in \mathbb{N}\}$ are 1-counter, but
$D \cap E = \{a^n b^n c^n \mid n \in \mathbb{N}\}$ is not context-free, so by Lemma 1
it is not 1-counter. □

However, we have 

Lemma 3 (Closure properties of $k$-counter languages). If $C, C'$ are $k$-counter for
$k \ge 1$ and $L$ is regular, then $C \cup C'$, $C \cap L$, $CL$ and $LC$ are all
$k$-counter.

Proof. Let $M, M'$ be $k$-counter automata for $C, C'$, with start states $q_0, q_0'$,
states $S, S'$, and accept states $A, A'$, respectively. Then construct a $k$-counter
automaton accepting $C \cup C'$ with a new start state joined to $q_0, q_0'$ by two
epsilon transitions.

Let $N$ be a finite state automaton for $L$ with states $T$, start state $p_0$ and
accept states $B$. Construct a $k$-counter automaton accepting $C \cap L$ having states
$S \times T$, start state $(q_0, p_0)$, such that $(q, p)$ is an accept state if
$q \in A$, $p \in B$ (they are both accept states), and if there are transitions from
$q$ to $r$ in $M$ labelled by $(a, g)$ and $p$ to $s$ in $N$ labelled by $a$ where
$g \in \mathbb{Z}^k$, then there is a transition from $(q, p)$ to $(r, s)$ labelled
$(a, g)$.

Construct a $k$-counter automaton accepting $CL$ with start state $q_0$ and accept
states $B$ by adding an epsilon transition from each accept state of $M$ to $p_0$.

Construct a $k$-counter automaton accepting $LC$ with start state $p_0$ and accept
states $A$ by adding an epsilon transition from each accept state of $N$ to $q_0$. □

Iterating the union operation a finite number of times gives 

Corollary 2. The union of a finite number of k-counter languages is k-counter. 

3. Context-free and not counter 

The language $\{a^n b^n a^n \mid n \in \mathbb{N}\}$ accepted by the
$\mathbb{Z}^2$-automaton in Figure 1 is not context-free by standard results. In this
section we show that conversely, there is a language that is context-free but not
counter.

Consider a string of letters $a, b, c$. We say a string contains a square if it has a
subword of the form $ww$. An interesting result from combinatorics is that one can
write out a square-free word in $a, b, c$ of arbitrary length. This is due to Thue and
Morse and described in IH] (Chapter 2). In particular we have 

Proposition 1 (Thue-Morse). Define a homomorphism $f$ on $\{a, b, c\}^*$ by
$f(a) = abc$, $f(b) = ac$ and $f(c) = b$. Then for any $i \in \mathbb{N}$, $f^i(a)$ is
square-free.

For example, to compute $f^3(a)$ we have
$a \to abc \to abcacb \to abcacbabcbac$.
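
The iteration in Proposition 1 is easy to reproduce by machine. The following Python
sketch applies the homomorphism $f$ repeatedly and tests the result for squares by
brute force; the function names `apply_f`, `iterate` and `has_square` are chosen for
this illustration only, and the check covers small exponents rather than proving the
proposition.

```python
# The homomorphism f of Proposition 1, given on single letters.
F = {"a": "abc", "b": "ac", "c": "b"}


def apply_f(word):
    """Apply f letter by letter."""
    return "".join(F[x] for x in word)


def iterate(i, seed="a"):
    """Return f^i(seed)."""
    word = seed
    for _ in range(i):
        word = apply_f(word)
    return word


def has_square(word):
    """Brute-force test for a subword of the form ww with w non-empty."""
    n = len(word)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            if word[start:start + half] == word[start + half:start + 2 * half]:
                return True
    return False
```

Here `iterate(3)` reproduces the word `abcacbabcbac` computed above, and `has_square`
can be run on further iterates to watch square-freeness persist.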

In order to show that a language is not counter we make use of the following
lemma.

Lemma 4 (Swapping Lemma). If $L$ is counter then there is a constant $s > 0$,
the "swapping length", such that if $w \in L$ with length at least $2s + 1$ then $w$
can be divided into four pieces $w = uxyz$ such that $|uxy| \le 2s + 1$,
$|x|, |y| > 0$ and $uyxz \in L$.

Proof. Let $s$ be the number of states in the counter automaton, and let $p$ be a
path in the $\mathbb{Z}^k$-automaton such that $w(p) = w$. If $p$ visits each state at
most twice then it cannot have length more than $2s$, so $p$ visits some state at least
three times. Let $u$ be the first part of $w(p)$ until it hits this state, then $x$ a
non-trivial loop back to this state the second time, $y$ a loop back a third time, and
$z$ the rest of $w$. So $w(p) = uxyz$ ends at an accept state, and the second component
of $p$ equals $g(u)g(x)g(y)g(z) = 1$. Switching the orders of $x$ and $y$, the path
$uyxz$ still takes you to the same accept state, and $g(uyxz) = 1$ since all elements
of $\mathbb{Z}^k$ commute, so $uyxz \in L$. □

Note its similarity to the pumping lemmas for regular and context-free languages.
This lemma is only of any use if your word $w$ has no squares, otherwise
you can just swap the square and get the same word (that is, $x = y$).

Theorem 3.1. There is a language that is context-free but not counter. 

Proof. Consider the language of all strings in $a, b, c$ of the form $ww^R$, where
$w^R$ is the word obtained by reversing $w$. It is well known that this is a
context-free language, since it is accepted by a pushdown automaton which uses the
stack to store the first half of the word, then checks the last half of the word
matches.

Suppose by way of contradiction that this language is counter, with swapping
length $p$ as in Lemma 4. Let $w$ be a square-free word from Proposition 1 of length
at least $2p + 1$. Then $ww^R$ can be split into four subwords $u, x, y, z$ such that
$uxy$ falls in the first $w$ prefix. Since $w$ has no squares and $x, y$ are adjacent
words, it must be that $xy \neq yx$ (if $xy = yx$ then $x$ and $y$ are powers of a
common word, so $xy$ contains a square). But then $uyxz$ will fail to be in the
language because the second part will not be the reverse of the first part. □
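
The last step of this proof can be checked exhaustively on a sample word. The Python
sketch below is only an illustration under stated assumptions: it builds a square-free
word with the homomorphism of Proposition 1, forms $ww^R$, and searches for a swap of
adjacent non-empty $x, y$ inside the first copy of $w$ that stays in the language. All
function names are hypothetical.

```python
# The homomorphism f of Proposition 1, used here to produce a square-free word.
F = {"a": "abc", "b": "ac", "c": "b"}


def square_free_word(i):
    """Return f^i(a)."""
    word = "a"
    for _ in range(i):
        word = "".join(F[x] for x in word)
    return word


def is_even_palindrome(word):
    """Words of the form v v^R are exactly the palindromes of even length."""
    return len(word) % 2 == 0 and word == word[::-1]


def swaps_staying_in_language(w):
    """Return all (u, x, y) with uxy a prefix of w and x, y non-empty such that
    swapping x and y in w w^R gives another word of the form v v^R."""
    full = w + w[::-1]
    n = len(w)
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n + 1):
                u, x, y, z = full[:i], full[i:j], full[j:k], full[k:]
                if is_even_palindrome(u + y + x + z):
                    found.append((u, x, y))
    return found
```

For the square-free word `square_free_word(4)` the search returns an empty list,
whereas for a word containing a square, such as `aab`, the swap of the two equal
letters is found, matching the remark that the lemma is only useful without squares.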

In Figure 5 we have a diagram of sets of regular, 1-counter, context-free and
counter languages, and by the above results we have shown the given inclusions.

The fact that there are counter languages that are not context-free and vice
versa can be observed by considering word problems for various groups. The word
problem for a group $G$ with generating set $\mathcal{G}$ is the set
$WP(G) = \{w \in \mathcal{G}^* : w =_G 1\}$ of all words in the generating set that
evaluate to the identity element. By work of Muller and Schupp, the word problem for
the group $\mathbb{Z}^2$ is not a context-free language, whereas the word problem of
the free group on two (or more) generators is context-free. Elston and Ostheimer
proved that a group has a deterministic counter word problem (with a so-called inverse
property) if and only if it is virtually abelian, so the word problem for
$\mathbb{Z}^2$ is counter. To see why $WP(F_2)$ is not counter, consider a Thue-Morse
word made up of an arbitrary number of subwords $(aaa)$, $(aba)$, $(ab^{-1}a)$,
followed by its "reverse" in the subwords $(a^{-1}a^{-1}a^{-1})$,
$(a^{-1}b^{-1}a^{-1})$, $(a^{-1}ba^{-1})$. This word is in the word problem, but
applying the Swapping Lemma (Lemma 4) gives a word that is non-trivial.

The first examples of languages that are counter but not context-free were given
by Mitrana and Stiebe [13]. Mitrana and Stiebe give the following lemma, which
they call the "interchange lemma", which they use to show that the language of
palindromes, and the language $\{a^i b^i \mid i > 0\}^*$, are not counter. We include
it here for completeness, and to show how it differs from the Swapping Lemma above.

Figure 5. Intersections of the formal languages

Lemma 5 (Interchange Lemma [13]). If $L$ is the language of a $G$-automaton
where $G$ is an abelian group, then there is a constant $p$ such that for any word
$x \in L$ of length at least $p$, and for any given subdivision of $x$ into subwords
$v_1 w_1 v_2 w_2 \cdots w_p v_{p+1}$ with $|w_i| \ge 1$, there are some $r, s$ such that
the word obtained from $x$ by interchanging $w_r$ and $w_s$ is in $L$.

4. The normal form language 

Recall that $BS(1,2) = \langle a, t \mid tat^{-1} = a^2 \rangle$ with the (standard)
inverse closed generating set $\mathcal{G} = \{a, a^{-1}, t, t^{-1}\}$. We wish to
describe geodesic words with respect to this generating set.

Definition 6 (E, N, P, X). A word is of the form E if it is a power of $a$. A word is
of the form N if it has no $t$ letters and at least one $t^{-1}$ letter. A word is of
the form P if it has no $t^{-1}$ letters and at least one $t$ letter.

A word is of the form X if it is the concatenation of a P word of t-exponent $k$,
followed by an N word of t-exponent $(-k)$. That is, an X word is a word of type
PN with zero t-exponent.

Benson Farb called words of type X "mesas", since drawing an X word in the
Cayley graph resembles this land formation. See Figure 6.

While the following fact is well known, we include an elementary proof of it here 
for completeness. 

Lemma 6 (Commutation). If $u$ has zero t-exponent then $au =_{BS} ua$ and
$a^{-1}u =_{BS} ua^{-1}$.

Proof. If $u$ is type X then $u =_{BS} a^i$ for some $i$, so
$au =_{BS} a^{i+1} =_{BS} ua$.

If $u$ is type NP then let $u = vw$ where $v$ is type N with t-exponent $-k$ and $w$
is type P (so has t-exponent $k$). Each time we push $a^i$ past a $t^{-1}$ it becomes
$a^{2i}$ since $at^{-1} =_{BS} t^{-1}a^2$. Then $au = avw =_{BS} va^{2^k}w$. Each time
we push $a^{2i}$ past a $t$ it becomes $a^i$ since $a^2t =_{BS} ta$. So
$au = avw =_{BS} va^{2^k}w =_{BS} vwa = ua$. Finally if $u$ is any other form, first
replace each occurrence of $ta^it^{-1}$ in $u$ by $a^{2i}$. Then $u$ becomes a word of
type NP with zero t-exponent. We can pass $a$ through this word as in the previous
case, and then put $u$ back in its original form and we are done. The same argument
applies to $a^{-1}$. □

Figure 6. An X word
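
Lemma 6 can be spot-checked numerically. The Python sketch below is an illustration
only: it assumes the standard matrix representation of BS(1, 2) (with $a$ sent to
[[1, 1], [0, 1]] and $t$ to [[2, 0], [0, 1]]), writes `A`, `T` for $a^{-1}$, $t^{-1}$,
and samples random words of zero t-exponent. All names are hypothetical, and a random
test is of course not a proof.

```python
import random
from fractions import Fraction

# Assumed encoding: a matrix [[m, u], [0, 1]] is stored as the pair (m, u).
GEN = {
    "a": (Fraction(1), Fraction(1)),
    "A": (Fraction(1), Fraction(-1)),   # a^{-1}
    "t": (Fraction(2), Fraction(0)),
    "T": (Fraction(1, 2), Fraction(0)), # t^{-1}
}


def evaluate(word):
    """Multiply out the matrices of the letters of `word`, left to right."""
    m, u = Fraction(1), Fraction(0)
    for letter in word:
        gm, gu = GEN[letter]
        m, u = m * gm, m * gu + u
    return m, u


def t_exponent(word):
    return word.count("t") - word.count("T")


def random_zero_exponent_word(length, rng):
    """Rejection-sample a word over a, A, t, T with zero t-exponent."""
    while True:
        word = "".join(rng.choice("aAtT") for _ in range(length))
        if t_exponent(word) == 0:
            return word


def commutes_with_a(u):
    """Check au = ua and a^{-1}u = ua^{-1} in the assumed representation."""
    return (evaluate("a" + u) == evaluate(u + "a")
            and evaluate("A" + u) == evaluate(u + "A"))
```

Every sampled word of zero t-exponent passes `commutes_with_a`, while a word such as
`t`, whose t-exponent is 1, does not.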

Lemma 7 (Miller [12]). Every geodesic word in $\mathcal{G}^*$ is a subword of a word of
type NPN or PNP.

See Lemma 1 of jS] for a proof. We can use this lemma to describe a subset of 
geodesic words that represent every group element. 

Define a type $NP_{\le}$ word to be a word of type NP with non-positive t-exponent
sum, and type $NP_{>}$ to be a word of type NP with positive t-exponent sum.

Lemma 8 (Ten types). Every element of BS(1, 2) has a geodesic representative in
$\mathcal{G}^*$ that is one of ten types:

E, X, N, XN, $NP_{\le}$, XNP having t-exponent $\le 0$, or

P, PX, $NP_{>}$, NPX having t-exponent $> 0$, such that no more than three $a$ or
$a^{-1}$ letters can occur in succession in the geodesic.

Hermiller and the author used a similar characterisation in our work on minimal 
almost convexity [S]. 

Proof. Every group element can be represented by some geodesic word in
$\mathcal{G}^*$. If a geodesic word has no $t^{\pm 1}$ letters then it is type E.
Otherwise by Lemma 7 it is a word of type N, P, NP, PN, NPN or PNP.

If the geodesic is type NP then it either has non-positive t-exponent sum, so is
type $NP_{\le}$, or positive t-exponent sum, so is type $NP_{>}$.

If the geodesic is type PN then it either has zero t-exponent sum, so is type
X, negative t-exponent sum, so is type XN, or positive t-exponent sum, so is type
PX.

Suppose the geodesic is a word $w$ of type NPN. If $w$ has positive t-exponent
sum it is type NPX. If $w$ has zero t-exponent sum, then write it as $ux$ where $u$
is type NP with zero t-exponent sum and $x$ is type X. By Lemma 6, $w =_{BS} xu$
which has the same length and is type XNP. If $w$ has negative t-exponent sum,
then $w = a^{\epsilon_1}t^{-1}uta^{\epsilon_2}txt^{-1}a^{\epsilon_3}t^{-1}v$ where $u$
is type E or NP with zero t-exponent sum, $x$ is type E or X, $v$ is type E or N, and
$\epsilon_i \in \mathbb{Z}$. Then by Lemma 6

$w =_{BS} a^{\epsilon_1+\epsilon_2+\epsilon_3}(t^{-1}ut)(txt^{-1})t^{-1}v$

$=_{BS} a^{\epsilon_1+\epsilon_2+\epsilon_3}(txt^{-1})(t^{-1}ut)t^{-1}v$

which is not geodesic since we can cancel $tt^{-1}$ at the end.

Finally, suppose the geodesic is a word $w$ of type PNP. If $w$ has negative
or zero t-exponent sum it is type XNP. If $w$ has positive t-exponent sum, then
$w = a^{\epsilon_1}txt^{-1}a^{\epsilon_2}t^{-1}uta^{\epsilon_3}tv$ where $x$ is type E
or X, $u$ is type E or NP with zero t-exponent sum, $v$ is type E or P, and
$\epsilon_i \in \mathbb{Z}$. Then by Lemma 6

$w =_{BS} a^{\epsilon_1+\epsilon_2+\epsilon_3}(txt^{-1})(t^{-1}ut)tv$

$=_{BS} a^{\epsilon_1+\epsilon_2+\epsilon_3}(t^{-1}ut)(txt^{-1})tv$

which is not geodesic since we can cancel $t^{-1}t$ at the end.

The additional condition that no more than three $a$'s are allowed in succession
is obtained by observing that $a^6 =_{BS} ta^3t^{-1}$ so any power of $a$ greater than
five is not geodesic, and since $a^4 =_{BS} ta^2t^{-1}$ and
$a^5 =_{BS} ta^2t^{-1}a =_{BS} ata^2t^{-1}$ we choose to replace $a$-exponents of 4 or
5 by subwords of the same length. An identical argument eliminates powers of $a^{-1}$
greater than three. □
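
The lengths quoted in this argument can be confirmed by a breadth-first search in the
Cayley graph. The Python sketch below is illustrative and rests on an assumption not
made in the text: a group element is stored as a pair $(k, u)$ standing for the matrix
[[2^k, u], [0, 1]] in the standard representation of BS(1, 2). The name `ball` is
hypothetical.

```python
from collections import deque
from fractions import Fraction


def ball(radius):
    """Breadth-first search from the identity in the Cayley graph of BS(1, 2)
    with generators a, a^-1, t, t^-1. Returns a dict mapping each element within
    `radius` of the identity to its geodesic length. Elements are pairs (k, u)
    representing [[2^k, u], [0, 1]] (an assumed encoding)."""
    identity = (0, Fraction(0))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        k, u = queue.popleft()
        d = dist[(k, u)]
        if d == radius:
            continue
        step = Fraction(2) ** k   # right multiplication by a adds 2^k to u
        neighbours = ((k, u + step), (k, u - step), (k + 1, u), (k - 1, u))
        for nxt in neighbours:
            if nxt not in dist:
                dist[nxt] = d + 1
                queue.append(nxt)
    return dist
```

In this encoding $a^n$ is the pair `(0, Fraction(n))`. Looking up $a^4$, $a^5$ and
$a^6$ in `ball(6)` gives lengths 4, 5 and 5, consistent with $a^4$ and $a^5$ being
geodesic and $a^6$ being shortened by $ta^3t^{-1}$.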

Definition 7 (Run). An N-run is a word of the form

$a^{i_0}t^{-1}a^{i_1}t^{-1} \cdots t^{-1}a^{i_{k-1}}t^{-1}a^{i_k}$.

A P-run is a word of the form

$a^{i_0}ta^{i_1}t \cdots ta^{i_{k-1}}ta^{i_k}$.

We can write a run in shorthand by just writing the $a$-exponents. For example,
$a^2t^{-1}at^{-1}a^0t^{-1}at^{-1}a^{-1}$ can be written as 2101(-1).

We call the $a$-exponents entries of the run. A run is non-trivial if it has at least
one non-zero entry. Note that a run that has at least one $t$ or $t^{-1}$ letter will
have at least two entries, since by definition a run starts and ends with a power of
$a$ (possibly $a^0$).
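
The shorthand is mechanical to convert in both directions. The following Python sketch
is illustrative; it encodes $a^{-1}$ and $t^{-1}$ as `A` and `T`, which is an
assumption of the sketch, and the function names are hypothetical.

```python
def run_to_word(entries, kind):
    """Spell out a run from its list of a-exponents.
    `kind` is "N" (entries separated by t^-1) or "P" (separated by t).
    Letters: a, A = a^-1, t, T = t^-1."""
    separator = {"N": "T", "P": "t"}[kind]
    powers = [("a" if e >= 0 else "A") * abs(e) for e in entries]
    return separator.join(powers)


def word_to_entries(word, kind):
    """Recover the entries of a run written over a, A and one t-letter."""
    separator = {"N": "T", "P": "t"}[kind]
    return [block.count("a") - block.count("A") for block in word.split(separator)]


def is_nontrivial(entries):
    """A run is non-trivial if it has at least one non-zero entry."""
    return any(e != 0 for e in entries)
```

For the example above, `run_to_word([2, 1, 0, 1, -1], "N")` gives `aaTaTTaTA`, and
`word_to_entries` inverts it.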

We say a geodesic has at most one non-trivial run if it can be expressed as the
concatenation of geodesic N- or P-runs such that at most one factor is non-trivial.
For example, the word t^a'^t^^at^'^ can be written as (t^)(a^t^^at^^), so has at 
most one run. 

Drawing the N-run represented by 2101(-1) in the Cayley graph we start to
see what behaviour is allowed in a geodesic. For instance, the sub-runs 1(-1) and
(-1)1 are not allowed since

$at^{-1}a^{-1} \to t^{-1}a$ and $a^{-1}t^{-1}a \to t^{-1}a^{-1}$.

Also, if the N-run 2101(-1) were preceded by a $t^{-1}$ then we would have
$t^{-1}a^2$ which can be written as $at^{-1}$. In fact, the only time you could ever
see an entry that is not 0, 1 or -1 is at the start of an N-run, or the end of a
P-run.

Lemma 9 (No $|i| \ge 6$). If a run represents a geodesic word and has an entry $i$
that is not one of 1, 0 and (-1), then $i$ must be one of 2, 3, 4, 5, (-2), (-3),
(-4), (-5) and occurs at the start of an N-run or the end of a P-run.

Proof. If $i \ge 6$ occurs at any point in a run then $a^6 \to ta^3t^{-1}$ so the run
is not geodesic, and symmetrically for $i \le -6$.

For N-runs, if $i \ge 2$ occurs after the start of the run then
$t^{-1}a^2 \to at^{-1}$ so the run is not geodesic. If $i \le -2$ occurs after the
start of the run then $t^{-1}a^{-2} \to a^{-1}t^{-1}$ so the run is not geodesic.

For P-runs, if $i \ge 2$ occurs before the end of the run then $a^2t \to ta$ so the
run is not geodesic. If $i \le -2$ occurs before the end of the run then
$a^{-2}t \to ta^{-1}$ so the run is not geodesic. □

Figure 7. The N-run 2101(-1)

Lemma 10 (No consecutive 1(-1), (-1)1). A geodesic run cannot contain 1(-1)
or (-1)1.

Proof. For an N-run:

1(-1) → 01, that is, $at^{-1}a^{-1} \to t^{-1}a$
(-1)1 → 0(-1), that is, $a^{-1}t^{-1}a \to t^{-1}a^{-1}$

For a P-run:

1(-1) → (-1)0, that is, $ata^{-1} \to a^{-1}t$
(-1)1 → 10, that is, $a^{-1}ta \to at$

See Figure 8. □

Figure 8. No 1(-1), (-1)1 in a run

Lemma 11 (No consecutive 11, (-1)(-1)). There exist rewrite rules which do not
increase length which can be applied to a geodesic run to eliminate all occurrences
of consecutive 11 or (-1)(-1) after the first two entries of an N-run and before the
last two entries of a P-run.

Proof. Let $i \in \mathbb{Z}$.
For an N-run:

$i11 \to (i+1)0(-1)$, that is, $a^it^{-1}at^{-1}a \to a^{i+1}t^{-2}a^{-1}$

$i(-1)(-1) \to (i-1)01$, that is, $a^it^{-1}a^{-1}t^{-1}a^{-1} \to a^{i-1}t^{-2}a$.

Figure 9. No $i11$, $i(-1)(-1)$ in an N-run



These moves are illustrated in Figure 9. We can always perform these rewrites
to get a word of the same length or shorter. That is, suppose you have an N-run,
which is geodesic so we assume has no 1(-1) or (-1)1. Starting at the right end of
the N-run, if there is an $i11$, we know that $i \ge 0$. Replacing this by
$(i+1)0(-1)$ gives a word that is not geodesic if $i > 0$, otherwise gives 10(-1).
Now if the preceding entry is (-1) the word is not geodesic, so it is 0, 1 or we are at
the start of the run. A similar argument holds when we see $i(-1)(-1)$.

So iterate this procedure until the start of the run is reached. This eliminates
all occurrences of adjacent nonzero entries after the first two entries. That is, if
the N-run starts with 110 for example, the rules don't apply.

For a P-run:

$11i \to (-1)0(i+1)$, that is, $atata^i \to a^{-1}t^2a^{i+1}$

$(-1)(-1)i \to 10(i-1)$, that is, $a^{-1}ta^{-1}ta^i \to at^2a^{i-1}$.

Similarly we can always perform these rewrites to get a word of the same length
or shorter, this time starting at the left end of the word and moving right, so we
can eliminate all adjacent nonzero entries except in the last two positions. □
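
The rewrite rules of Lemmas 10 and 11 can be verified mechanically. The Python sketch
below is a check under stated assumptions, not part of the proof: it uses the standard
matrix representation of BS(1, 2), writes `A`, `T` for $a^{-1}$, $t^{-1}$, and
confirms for a range of entries that each rule preserves the group element and does
not increase length. All names are hypothetical.

```python
from fractions import Fraction

# Assumed encoding: a matrix [[m, u], [0, 1]] is stored as the pair (m, u).
GEN = {
    "a": (Fraction(1), Fraction(1)),
    "A": (Fraction(1), Fraction(-1)),   # a^{-1}
    "t": (Fraction(2), Fraction(0)),
    "T": (Fraction(1, 2), Fraction(0)), # t^{-1}
}


def evaluate(word):
    """Multiply out the matrices of the letters of `word`, left to right."""
    m, u = Fraction(1), Fraction(0)
    for letter in word:
        gm, gu = GEN[letter]
        m, u = m * gm, m * gu + u
    return m, u


def run_to_word(entries, kind):
    """Spell out an N-run (separator t^-1) or P-run (separator t)."""
    separator = {"N": "T", "P": "t"}[kind]
    return separator.join(("a" if e >= 0 else "A") * abs(e) for e in entries)


def check_rule(before, after, kind):
    """True if both runs give the same group element and the rewrite
    does not increase word length."""
    u, v = run_to_word(before, kind), run_to_word(after, kind)
    return evaluate(u) == evaluate(v) and len(v) <= len(u)
```

For example, `check_rule([i, 1, 1], [i + 1, 0, -1], "N")` holds for each small $i$
tried, and the Lemma 10 rule `check_rule([1, -1], [0, 1], "N")` shortens the word by
one letter.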

Next we will show that every geodesic of one of the ten types can be "pushed" into
a geodesic word for the same group element that has at most one non-trivial run.
As an example, if
$w = a^{\epsilon_0}ta^{\epsilon_1}t \cdots a^{\epsilon_{k-1}}ta^it^{-1}a^{\eta_{k-1}}t^{-1} \cdots t^{-1}a^{\eta_1}t^{-1}a^{\eta_0}$
is a geodesic X word, then we can push the inner subword
$a^{\epsilon_{k-1}}ta^it^{-1}a^{\eta_{k-1}}$ to
$ta^it^{-1}a^{\epsilon_{k-1}+\eta_{k-1}}$, and iteratively push at each level to get

$t^ka^it^{-1}a^{\epsilon_{k-1}+\eta_{k-1}}t^{-1} \cdots t^{-1}a^{\epsilon_1+\eta_1}t^{-1}a^{\epsilon_0+\eta_0}$.

We show this in Figure 10.

Lemma 12 (At most one run). Every group element is represented by some geodesic
of one of the ten types having at most one non-trivial run.

Proof. By Lemma 8 each group element is represented by some geodesic of one of
the ten types. If the word is type E, N, P then there is at most one non-trivial run.
If it is X, XN or PX then by Lemma 6 we can push $a$ letters to one side of the X
word to get at most one non-trivial run, as we did in the example above. For
$NP_{\le}$ words we have $w = w_N w_{NP}$ where $w_{NP}$ has zero t-exponent, so by
Lemma 6 we can push $a$ letters to the left of the NP word to get at most one
non-trivial run. For XNP words we have $w = w_X w_N w_{NP}$ where $w_{NP}$ has zero
t-exponent, so by Lemma 6 we can push $a$ letters to one side of the X and NP words to
get at most one non-trivial run. For $NP_{>}$ words we have $w = w_{NP} w_P$ where
$w_{NP}$ has zero t-exponent, so by Lemma 6 we can push $a$ letters to the right of the
NP word to get at most one non-trivial run. For NPX words we have
$w = w_{NP} w_P w_X$ where $w_{NP}$ has zero t-exponent, so by Lemma 6 we can push $a$
letters to one side of the X and NP words to get at most one non-trivial run. □

Figure 10. Pushing an X word to have one non-trivial N-run.

Given that every word can be pushed into a word having at most one non-trivial
run, and we can choose which patterns are not allowed in a run, we are ready to
define the normal form language.

The only issue that remains is the prefix of each run. For example, a geodesic
of type X can be pushed into a word with exactly one N-run. The start of this
run can be chosen to be either $a^2t^{-1}$, $a^3t^{-1}$, $a^{-2}t^{-1}$ or
$a^{-3}t^{-1}$, for if the run starts with 1 then $tat^{-1} \to a^2$ so is not
geodesic. If it starts with 4 or 5 then by Lemma 8 $a^4 \to ta^2t^{-1}$ and
$a^5 \to ta^2t^{-1}a$ so we elect to write it starting with a 2 instead,
and if the run starts with $i \ge 6$ then it is not geodesic.

The next few entries could be any one of the following:
200, 201, 210, 300, 301, 30(-1), 310 or the negatives of these.

Note that the prefix 20(-1) is not allowed since $t^2a^2t^{-2}a^{-1}$ is not geodesic,
whereas 30(-1) is allowed since $t^2a^3t^{-2}a^{-1}$ is geodesic. See Figure 11.



Figure 11. Prefixes for N-runs of an X word (panels: 20(-1) and 30(-1)).

Each case is treated separately in the following lemma. Then after these prefixes
(suffixes for P-runs) the run has only 0, 1, (-1) with no consecutive nonzero entries.

Lemma 13 (Prefixes/suffixes of runs). In this lemma we assume that each word
has been pushed into a word with at most one non-trivial run, and that each run
has at least three $t^{\pm 1}$ letters.

• The N-run in a geodesic word of type X, XN, XNP with non-positive t-exponent
sum must start with one of

200, 201, 210, 300, 301, 30(-1), 310 or the negatives of these.

• The N-run in a geodesic word of type N, $NP_{\le}$ with non-positive t-exponent
sum must start with one of

000, 001, 010, 100, 101, 10(-1), 110, 200, 201, 20(-1), 210, 300, 301, 30(-1), 310
or the negatives of these.

• The P-run in a geodesic word of type P, $NP_{>}$ with positive t-exponent sum
must end with one of

000, 100, 010, 001, 101, (-1)01, 011, 002, 102, (-1)02, 012, 003, 103, (-1)03, 013
or the negatives of these.

• The P-run in a geodesic word of type PX, NPX with positive t-exponent
sum must end with one of

002, 102, 012, 003, 103, (-1)03, 013 or the negatives of these.



Proof. If an A'^-run starts with ill or i(— 1)(— 1) then by Lemma [TTl we can replace 
ill by {i + 1)0(— 1) and i(— 1)(— 1) by {i — 1)01 without increasing length. Thus 
the first three entries of an iV-run will include a 0. 

If an N-run in a word of type X, XN or XNP starts with i with $|i| \ge 4$ then we
can replace $ta^{\pm 4}a^{j}t^{-1}$ by $t^{2}a^{\pm 2}t^{-1}a^{j}t^{-1}$ to get a word of the same type and preserving
length. If an N-run in a word of type X, XN or XNP starts with i with $|i| \le 1$
then we can replace $ta^{i}t^{-1}$ by $a^{2i}$, reducing length, contradicting the fact that the
word is geodesic. Thus an N-run in a word of type X, XN or XNP starts with
2, 3, (-2) or (-3).

This gives the following possibilities for the first three entries:
200, 201, 20(-1), 210, 2(-1)0, 300, 301, 30(-1), 310, 3(-1)0 or the negatives of these.
We can eliminate 2(-1)0 and 3(-1)0 since they encode $a^{i}t^{-1}a^{-1}t^{-1} = a^{i-1}t^{-1}at^{-1}$
for i = 2, 3 so are not geodesic. We also observe that 20(-1) encodes $t^{2}a^{2}t^{-2}a^{-1}$
which is not geodesic (as seen in Figure 11).

This leaves 200, 201, 210, 300, 301, 30(-1), 310 (or their negatives) as the possible
prefixes to the N-run in a geodesic of type X, XN or XNP. It is easy to check
that each of these prefixes is geodesic.

If the N-run in a word of type N, NP$_{\le}$ starts with i with $|i| \ge 4$ then we can
replace $a^{\pm 4}a^{j}t^{-1}$ by $ta^{\pm 2}t^{-1}a^{j}t^{-1}$ preserving length. Note that they become words
of type XN or XNP. If the N-run in a word of type N, NP$_{\le}$ starts with i with
$|i| \le 3$ then we can have prefixes of the form i0, i10 when $i \ge 0$ and i(-1)0 when
$i \le 0$.

Explicitly, this gives
000, 001, 010, 100, 101, 10(-1), 110, 200, 201, 20(-1), 210, 300, 301, 30(-1), 310
or their negatives. It is easy to check that each of these prefixes is geodesic. Note
that in this case we cannot eliminate 20(-1) since there are no preceding t's.

The proof for P-runs follows a similar argument, and is omitted. □ 
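
The equalities and length comparisons used in this proof can be checked mechanically. The following is a minimal sketch, not part of the argument: it assumes the standard faithful action of BS(1, 2) on the line, with a acting as x -> x + 1 and t as x -> 2x, and it represents a word as a list of (letter, exponent) pairs. The helper names `evaluate`, `length` and `n_run` are hypothetical.

```python
from fractions import Fraction


def evaluate(word):
    """Return (scale, shift) of the affine map x -> scale*x + shift for a word.

    `word` is a list of (letter, exponent) pairs with letter in {"a", "t"}.
    Assumption: a acts as x -> x + 1 and t as x -> 2x, so t a t^-1 = a^2.
    """
    scale, shift = Fraction(1), Fraction(0)
    for letter, exponent in word:
        if letter == "a":
            # Precompose the current map with x -> x + exponent.
            shift += scale * exponent
        elif letter == "t":
            # Precompose the current map with x -> 2^exponent * x.
            scale *= Fraction(2) ** exponent
        else:
            raise ValueError(f"unknown letter {letter!r}")
    return scale, shift


def length(word):
    """Word length with respect to the generators a and t and their inverses."""
    return sum(abs(exponent) for _, exponent in word)


def n_run(digits):
    """Build the N-run a^{d0} t^-1 a^{d1} t^-1 ... a^{dn} from a digit list."""
    word = []
    for position, digit in enumerate(digits):
        if position > 0:
            word.append(("t", -1))
        word.append(("a", digit))
    return word


# The prefix 20(-1) after t^2, and a shorter word for the same element.
bad = [("t", 2), ("a", 2), ("t", -2), ("a", -1)]
shorter = [("t", 1), ("a", 3), ("t", -1), ("a", 1)]
print(evaluate(bad) == evaluate(shorter), length(bad), length(shorter))
```

In this model two words represent the same group element exactly when they give the same pair, so the last line confirms that the word encoded by 20(-1) after $t^{2}$ has a shorter representative. Comparing `n_run([2, 1, 1])` with `n_run([3, 0, -1])` in the same way checks the rewriting of i11 into (i+1)0(-1) for i = 2.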



Lemma 14 (Short runs). In this lemma we assume that each word has been pushed
into a word with at most one non-trivial run, and that each run has no more than
two $t^{\pm 1}$ letters.

• The geodesics of type X, XN and XNP are the set $L_1$ of words of the form

$ta^{i}t^{-1}a^{j}$, ij = (±2)0, (±3)0, 21, 31, (-2)(-1), (-3)(-1);

$ta^{i}t^{-1}a^{j}t^{-1}a^{k}$, ijk = (±2)00, (±3)00, 201, (±3)01, (-2)0(-1),
(±3)0(-1), 210, 310, (-2)(-1)0, (-3)(-1)0;

$t^{2}a^{i}t^{-1}a^{j}t^{-1}a^{k}$, ijk = (±2)00, (±3)00, 201, (±3)01, (-2)0(-1),
(±3)0(-1), 210, 310, (-2)(-1)0, (-3)(-1)0;

$ta^{i}t^{-1}a^{j}t^{-1}a^{k}t$, ijk = 201, (±3)01, (-2)0(-1), (±3)0(-1).

• The geodesics of type N and NP$_{\le}$ are the set $L_2$ of words of the form

$a^{i}t^{-1}a^{j}$, ij = 00, (±1)0, (±2)0, (±3)0, 0(±1), 0(±2), 0(±3),
11, 21, 31, (-1)(-1), (-2)(-1), (-3)(-1);

$a^{i}t^{-1}a^{j}t$, ij = 0(±1), 0(±2), 0(±3), 11, 21, 31,
(-1)(-1), (-2)(-1), (-3)(-1);

$a^{i}t^{-1}a^{j}t^{-1}a^{k}$, ijk = 000, (±1)00, (±2)00, (±3)00, 0(±1)0, 0(±2)0, 0(±3)0,
00(±1), 00(±2), 00(±3), (±1)0(±1), (±2)0(±1), (±3)0(±1),
110, 210, 310, (-1)(-1)0, (-2)(-1)0, (-3)(-1)0;

$a^{i}t^{-1}a^{j}t^{-1}a^{k}t$, ijk = 00(±1), 00(±2), 00(±3), (±1)0(±1), (±2)0(±1), (±3)0(±1);

$a^{i}t^{-1}a^{j}t^{-1}a^{k}t^{2}$, ijk = 00(±1), 00(±2), 00(±3), (±1)0(±1), (±2)0(±1), (±3)0(±1).

• The geodesics of type P and NP$_{>}$ are the set $L_3$ of words of the form

$a^{i}ta^{j}$, ij = 00, 0(±1), 0(±2), 0(±3), (±1)0, 11, 12, 13,
(-1)(-1), (-1)(-2), (-1)(-3);

$t^{-1}a^{i}ta^{j}$, ij = (±1)0, 11, 12, 13, (-1)(-1), (-1)(-2), (-1)(-3);

$a^{i}ta^{j}ta^{k}$, ijk = 000, 00(±1), 00(±2), 00(±3), (±1)0(±1), (±1)0(±2), (±1)0(±3),
0(±1)0, 011, 012, 013, 0(-1)(-1), 0(-1)(-2), 0(-1)(-3);

$t^{-1}a^{i}ta^{j}ta^{k}$, ijk = (±1)0(±1), (±1)0(±2), (±1)0(±3);

$t^{-2}a^{i}ta^{j}ta^{k}$, ijk = (±1)0(±1), (±1)0(±2), (±1)0(±3).

• The geodesics of type PX and NPX (must have positive t-exponent) are
the set $L_4$ of words of the form

$a^{i}ta^{j}ta^{k}t^{-1}$, ijk = 00(±2), 00(±3), 012, 013, 0(-1)(-2), 0(-1)(-3),
102, 10(±3), (-1)0(-2), (-1)0(±3).

Proof. The proof is by exhaustive search. For the first two cases we have either one
or two $t^{-1}$ letters, so we consider $t^{p}a^{i}t^{-1}a^{j}t^{q}$ and $t^{p}a^{i}t^{-1}a^{j}t^{-1}a^{k}t^{q}$. The t-exponent
must be non-positive, so $p + q \le 1$ in the first case and $p + q \le 2$ in the second case.
For the a-exponents, $|i| \le 3$ and $|j|, |k| \le 1$. This gives a finite set of possibilities,
so we run through each and check if it gives a geodesic. Note that the pattern
20(-1) is not a geodesic if it appears in an N-run preceded by a t, yet it is geodesic
if it is in an N or NP$_{\le}$ geodesic.

By Lemma 11 we choose to reject runs of the form (i, 1, 1) and (i, -1, -1) in
favour of (i+1, 0, -1) and (i-1, 0, 1) respectively, so that we never see three
nonzero entries in a row, even at the start of a run. The details of the exhaustive check
are omitted.

For the third and fourth cases we have either one or two t letters, so we consider
$t^{-p}a^{i}ta^{j}t^{-q}$ and $t^{-p}a^{i}ta^{j}ta^{k}t^{-q}$. The t-exponent must be positive, so p, q = 0 in the
third and $p + q \le 1$ in the fourth cases. For the a-exponents, $|k| \le 3$ and $|i|, |j| \le 1$.
This gives a finite set of possibilities, so we run through each and check if it gives
a geodesic.

By Lemma 11 we choose to reject runs of the form (1, 1, i) and (-1, -1, i) in
favour of (-1, 0, i+1) and (1, 0, i-1) respectively, so that we never see three
nonzero entries in a row, even at the end of a run. The details of the exhaustive check
are omitted. □

Definition 8 (Normal form). There are ten distinct types of normal form words.

• Type $\mathcal{NF}_E$ words are precisely $\epsilon$, $a^{\pm 1}$, $a^{\pm 2}$, $a^{\pm 3}$.

• Type $\mathcal{NF}_X$, $\mathcal{NF}_{XN}$ and $\mathcal{NF}_{XNP}$, all with zero or negative t-exponent, are
the words: $t^{k}a^{e_l}t^{-1}a^{e_{l-1}}t^{-1}\cdots a^{e_1}t^{-1}a^{e_0}t^{m}$ such that $k > 0$ and $l \ge k + m$,
$e_0 \ne 0$ if $m > 0$, the N-run starts with one of 200, 201, 210, 300, 301, 30(-1), 310
or the negatives of these, and after this has only 0, 1, (-1) with no consecutive
nonzero entries (that is, no 1(-1), (-1)1, 11 or (-1)(-1) in the run).

If there are fewer than three $t^{-1}$ letters in the run, then the word is in the
set $L_1$ of Lemma 14.

• Type $\mathcal{NF}_N$ and $\mathcal{NF}_{NP_{\le}}$, all with negative t-exponent, are the words:
$a^{e_l}t^{-1}a^{e_{l-1}}t^{-1}\cdots a^{e_1}t^{-1}a^{e_0}t^{k}$ such that $0 \le k < l$, $e_0 \ne 0$ if $k > 0$, the
N-run starts with one of

000, 001, 010, 100, 101, 10(-1), 110, 200, 201, 20(-1), 210, 300, 301, 30(-1), 310
or the negatives of these, and after this has only 0, 1, (-1) with no consecutive
nonzero entries.

If there are fewer than three $t^{-1}$ letters in the run, then the word is in the
set $L_2$ of Lemma 14.

• Type $\mathcal{NF}_P$ and $\mathcal{NF}_{NP_{>}}$, all with positive t-exponent, are the words:
$t^{-k}a^{e_0}ta^{e_1}t\cdots a^{e_{l-1}}ta^{e_l}$
such that $0 \le k < l$, $e_0 \ne 0$ if $k > 0$, the P-run ends with one of
000, 100, 010, 001, 101, (-1)01, 011, 002, 102, (-1)02, 012, 003, 103, (-1)03, 013
or the negatives of these, and before this has only 0, 1, (-1) with no consecutive
nonzero entries.

If there are fewer than three t letters in the run, then the word is in the
set $L_3$ of Lemma 14.

• Type $\mathcal{NF}_{PX}$ and $\mathcal{NF}_{NPX}$, all with positive t-exponent, are the words:
$t^{-k}a^{e_0}ta^{e_1}t\cdots a^{e_{l-1}}ta^{e_l}t^{-m}$ such that $k \ge 0$, $m > 0$ and $k + m < l$, $e_0 \ne 0$ if
$k > 0$, the P-run ends with one of 002, 102, 012, 003, 103, (-1)03, 013 or the
negatives of these, and before this has only 0, 1, (-1) with no consecutive
nonzero entries.

The P-run must have at least two t letters since the t-exponent of the
word is positive. If there are fewer than three t letters in the run, then the
word is in the set $L_4$ of Lemma 14.
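
As an illustration of how the run conditions in Definition 8 can be tested, the following sketch checks the digit sequence of an N-run in a word of type X, XN or XNP. It is only a sketch under stated assumptions: the run is given as a list of a-exponents read left to right, runs with fewer than three digits are left to the finite list $L_1$, and the ban on consecutive nonzero entries is assumed to apply from the third digit onwards (so the leading pairs 21 and 31 are allowed only through the prefix list). The names `X_PREFIXES` and `is_valid_x_run` are hypothetical.

```python
# Allowed three-digit prefixes for the N-run of an X, XN or XNP normal form.
X_PREFIXES = {
    (2, 0, 0), (2, 0, 1), (2, 1, 0),
    (3, 0, 0), (3, 0, 1), (3, 0, -1), (3, 1, 0),
}
# "or the negatives of these"
ALLOWED = X_PREFIXES | {tuple(-d for d in p) for p in X_PREFIXES}


def is_valid_x_run(digits):
    """Check the prefix and tail conditions on an N-run given as a digit list."""
    digits = list(digits)
    if len(digits) < 3:
        raise ValueError("short runs are covered by the finite set L_1")
    if tuple(digits[:3]) not in ALLOWED:
        return False
    # After the prefix only the digits 0, 1 and -1 may occur.
    if any(d not in (-1, 0, 1) for d in digits[3:]):
        return False
    # Assumption: no two adjacent nonzero entries from the third digit onwards.
    rest = digits[2:]
    return all(x == 0 or y == 0 for x, y in zip(rest, rest[1:]))


print(is_valid_x_run([2, 0, 1, 0, -1, 0]))  # allowed prefix 201, sparse tail
print(is_valid_x_run([2, 0, -1, 0]))        # prefix 20(-1) is not allowed
```

The same shape of check would apply to the other three families, with the prefix set replaced by the corresponding list and, for P-runs, the digit list reversed.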

Lemma 15 (The language of normal forms surjects to the group). Every group 
element is represented by a normal form word. 

Proof. By Lemma 12 every group element is represented by a geodesic having
at most one run. Then by Lemma 11 we can remove any occurrences of 11 and
(-1)(-1) in the run (except possibly at the start of N and NP$_{\le}$ words and the
end of P and NP$_{>}$ words) without lengthening the word. Then if the resulting run
does not start (or end) with one of the number patterns given in Lemma 13 relative
to its type, it is not geodesic, and if it does, the word is in normal form. □

Definition 9 (HNN-extension). If G is a group with presentation $\langle \mathcal{G} \mid \mathcal{R} \rangle$ and
$\phi : A \to B$ is an isomorphism of subgroups $A, B \subseteq G$, define the HNN-extension
of G by $\phi$ to be the group with presentation $\langle \mathcal{G}, t \mid \mathcal{R}, \{tat^{-1} = \phi(a) : a \in A\} \rangle$.
The generator t is called the stable letter and A, B are called associated subgroups.

The group BS(1,2) is an HNN-extension of $\langle a \rangle$ with the isomorphism $\phi(a) = a^{2}$
between associated subgroups $\langle a \rangle$ and $\langle a^{2} \rangle$. The following fact about
HNN-extensions can be read in [10].

Lemma 16 (Britton's Lemma). If w is a word containing a $t^{\pm 1}$ letter in an
HNN-extension of G with associated subgroups A, B and if w represents the identity then w must contain
a subword (called a pinch) of the form $tat^{-1}$ or $t^{-1}\phi(a)t$ for some element $a \in A$.

Corollary 3 (t-exponent). For each element $g \in BS(1,2)$ there is an integer k
such that every word for g has t-exponent k.

Proof. If w represents the identity and has no $t^{\pm 1}$ letters then its t-exponent sum
is zero. If w represents the identity and has $t^{\pm 1}$ letters then by Britton's Lemma
it contains a pinch. Removing a pinch leaves the t-exponent of w unchanged, so
either you can remove all $t^{\pm 1}$ letters, in which case the t-exponent sum was zero,
or you cannot remove all $t^{\pm 1}$ letters, in which case the word did not represent the
identity.

If w and u are two words for the same group element with t-exponents k and l
respectively, then $wu^{-1} =_{BS} 1$ and has t-exponent $k - l = 0$, so w and u have the
same t-exponent. □

Lemma 17 (a-exponents). The X word $w = t^{k}a^{j}t^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0}$ represents
the element $a^{N}$ where

$$N = 2^{k}j + \sum_{i=0}^{k-1} 2^{i}e_i.$$

Moreover if $|e_i| \le 1$ for all $i \le k - 1$, $|j| \ge 2$ and $e_{k-1}$ is zero or the same
sign as j, then $|N| \ge 4$.

Also, the X word $w = a^{e_0}ta^{e_1}t\cdots ta^{e_{k-1}}ta^{j}t^{-k}$ represents the element $a^{N}$ where

$$N = 2^{k}j + \sum_{i=0}^{k-1} 2^{i}e_i$$

and moreover if $|e_i| \le 1$ for all $i \le k - 1$, $|j| \ge 2$ and $e_{k-1}$ is zero or the same
sign as j, then $|N| \ge 4$.
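
The bound in Lemma 17 can be sanity-checked by brute force for small k. The sketch below is an illustration only: it evaluates the closed formula $N = 2^{k}j + \sum_{i=0}^{k-1} 2^{i}e_i$ over all digit choices allowed by the hypotheses, restricting j to ±2 and ±3 (an assumption made to keep the search finite; a larger |j| can only increase |N|). The function names are hypothetical.

```python
from itertools import product


def x_word_exponent(j, e):
    """N = 2^k * j + sum_i 2^i * e[i], where e[i] = e_i and k = len(e)."""
    k = len(e)
    return (2 ** k) * j + sum((2 ** i) * digit for i, digit in enumerate(e))


def min_abs_exponent(k, j_values=(2, 3, -2, -3)):
    """Smallest |N| over all digit choices satisfying the hypotheses of the lemma."""
    best = None
    for j in j_values:
        for e in product((-1, 0, 1), repeat=k):
            top = e[k - 1]
            # e_{k-1} must be zero or have the same sign as j.
            if top != 0 and (top > 0) != (j > 0):
                continue
            value = abs(x_word_exponent(j, e))
            best = value if best is None else min(best, value)
    return best


for k in range(1, 7):
    print(k, min_abs_exponent(k))
```

For each k the minimum is attained at |j| = 2, $e_{k-1} = 0$ and all other digits of sign opposite to j, which is the extremal case treated in the proof.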

Proof. To prove the first assertion we will use induction on k. If k = 1 we have
$w = ta^{j}t^{-1}a^{e_0} = a^{2j+e_0}$.

Assuming the statement holds for k, then
$$w = t^{k+1}a^{j}t^{-1}a^{e_k}t^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0}
= t^{k}a^{2j+e_k}t^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0} = a^{N}$$
where $N = 2^{k}(2j + e_k) + \sum_{i=0}^{k-1} 2^{i}e_i$.

The smallest possible value for |N| is when |j| = 2, $e_{k-1} = 0$ and each remaining $e_i$ is $-j/|j|$.
In this case

$$|N| \ge 2^{k}(2) + 0 + \sum_{i=0}^{k-2} 2^{i}(-1)
= 2^{k}(2) - \sum_{i=0}^{k-2} 2^{i}
= 2^{k}(2) - (2^{k-1} - 1)
\ge 2(2) - (1 - 1) = 4 \quad \text{since } k \ge 1.$$

To prove the second assertion we will again use induction on k. If k = 1 we have

$w = a^{e_0}ta^{j}t^{-1} = a^{2j+e_0}$.

Assuming the statement holds for k, then
$$w = a^{e_0}t\cdots ta^{e_{k-1}}ta^{e_k}ta^{j}t^{-k-1}
= a^{e_0}t\cdots ta^{e_{k-1}}ta^{2j+e_k}t^{-k} = a^{N}$$
where $N = 2^{k}(2j + e_k) + \sum_{i=0}^{k-1} 2^{i}e_i$.

The smallest possible value for |N| is when |j| = 2, $e_{k-1} = 0$ and each remaining $e_i$ is $-j/|j|$.
In this case

$$|N| \ge 2^{k}(2) + 0 + \sum_{i=0}^{k-2} 2^{i}(-1)
= 2^{k}(2) - \sum_{i=0}^{k-2} 2^{i}
= 2^{k}(2) - (2^{k-1} - 1)
\ge 2(2) - (1 - 1) = 4 \quad \text{since } k \ge 1.$$ □

Lemma 18 (Uniqueness for $\mathcal{NF}_E \cup \mathcal{NF}_X$). If $w, u \in \mathcal{NF}_E \cup \mathcal{NF}_X$ and $w =_{BS} u$
then w and u are identical.

Proof. If $w, u \in \mathcal{NF}_E$ then $w = a^{i}$ and $u = a^{j}$, and $a^{i} =_{BS} a^{j}$ means $a^{i-j} = 1$, so
i = j and w and u are identical.

If $w \in \mathcal{NF}_X$ then we can write $w = t^{k}a^{e_k}t^{-1}a^{e_{k-1}}t^{-1}\cdots a^{e_1}t^{-1}a^{e_0}$ with k > 0,
which evaluates to the power $a^{N}$ with $|N| \ge 4$ by Lemma 17, so w cannot be equal
to a word in $\mathcal{NF}_E$.

If $u \in \mathcal{NF}_X$ and $w =_{BS} u$ then we can write
$u = t^{l}a^{\eta_l}t^{-1}a^{\eta_{l-1}}t^{-1}\cdots t^{-1}a^{\eta_k}t^{-1}\cdots a^{\eta_1}t^{-1}a^{\eta_0}$,
where without loss of generality we are assuming that $k \le l$. Since both words
evaluate to the same power of a we have

$$e_k2^{k} + e_{k-1}2^{k-1} + \cdots + e_12 + e_0 = \eta_l2^{l} + \eta_{l-1}2^{l-1} + \cdots + \eta_k2^{k} + \cdots + \eta_12 + \eta_0.$$

Let $i \in \mathbb{N}$ such that $e_j = \eta_j$ for all j < i and $e_i \ne \eta_i$. Then cancelling and
dividing through by $2^{i}$ we have

(1) $e_k2^{k-i} + e_{k-1}2^{k-1-i} + \cdots + e_i = \eta_l2^{l-i} + \eta_{l-1}2^{l-1-i} + \cdots + \eta_i$.

If i = k then $|e_k| = 2$ or 3 and we have $e_k = \eta_l2^{l-k} + \eta_{l-1}2^{l-1-k} + \cdots + \eta_k$. If
l = k then $e_k = \eta_k$ so w and u are identical. If $l \ge k + 1$ then
$|e_k| = |\eta_l2^{l-k} + \eta_{l-1}2^{l-1-k} + \cdots + \eta_k| \ge 4$ since $|\eta_l| \ge 2$ and $\eta_{l-1}$ is either 0 or the same sign as $\eta_l$,
but $|e_k| \le 3$ so this is a contradiction.

If i < k then $e_i, \eta_i$ are either 0, ±1 since they occur in the middle of a run. By
Equation (1) they must be of the same parity, and they cannot both be zero so one is
1 and one is (-1). If i + 1 < k then $e_{i+1} = \eta_{i+1} = 0$ and we contradict the equation
since one side is equal to 1 mod 4 and the other is (-1) mod 4.

So i + 1 = k, so the run in w starts with 210 or 310 (or their negatives). Then
$w = t^{k}a^{s}t^{-1}aw''$ and $u = t^{k}u't^{-1}a^{-1}w''$ with s = 2, 3, so $u' =_{BS} a^{s+1}$, so is $a^{3}$ or $a^{4}$,
which by Lemma 8 is written as $ta^{2}t^{-1}$ if it occurs in a normal form word. Then
the run in u must start with either 3(-1) or 20(-1), neither of which is allowed in
a normal form word, so w and u are identical. □

Lemma 19 (Uniqueness for $\mathcal{NF}_N \cup \mathcal{NF}_{XN}$). If $w, u \in \mathcal{NF}_N \cup \mathcal{NF}_{XN}$ and $w =_{BS} u$
then w and u are identical.

Proof. If w and u are two normal form words representing the same group element,
then they have the same t-exponent by Corollary 3. If $w, u \in \mathcal{NF}_{XN}$ with t-exponent
(-k) then $t^{k}w, t^{k}u$ are in $\mathcal{NF}_X$ so by Lemma 18 they are identical. Note that
$\mathcal{NF}_{XN}$ and $\mathcal{NF}_X$ words have the same N-run structure, the only difference is the
length of the prefix.

If $w \in \mathcal{NF}_N$ then let $w = a^{e_k}t^{-1}\cdots t^{-1}a^{e_0}$ and let
$u = u't^{-1}a^{\eta_{k-1}}t^{-1}\cdots t^{-1}a^{\eta_0}$ where u' evaluates to $a^{n}$ and is type X or E. The
words $t^{k}w$ and $t^{k}u$ evaluate to the same power of a, which is
$e_k2^{k} + \cdots + e_0 = n2^{k} + \eta_{k-1}2^{k-1} + \cdots + \eta_0$. Let $i \in \mathbb{N}$ such that $e_j = \eta_j$ for all j < i and $e_i \ne \eta_i$.
Then cancelling and dividing through by $2^{i}$ we get

(2) $e_k2^{k-i} + \cdots + e_i = n2^{k-i} + \eta_{k-1}2^{k-1-i} + \cdots + \eta_i$.

If i = k then $e_k = n$. Now $|e_k| \le 3$ and u' is an E or X word with the same
a-exponent. By Lemma 17 if u' is type X then it evaluates to $a^{N}$ with $|N| \ge 4$, so
u' is type E, indeed it is exactly $a^{e_k}$, so w and u are identical.

If i < k then $e_i, \eta_i = \pm 1$ since they are in the middle of a run, and have the same
parity by Equation (2). If i < k - 1 then we have a contradiction since $e_{i+1} = \eta_{i+1} = 0$
and the equation has 4x + 1 on one side and 4y - 1 on the other for integers x, y.
So i = k - 1 and $e_k2 + e_{k-1} = n2 + \eta_{k-1}$, so $n = e_k \pm 1$ since $e_{k-1} - \eta_{k-1} = \pm 2$, and
$e_{k-1}$ has the same sign as $e_k$.

If u' is type X then $|n| \ge 4$ by Lemma 17 but $|e_k| \le 3$, so the only chance for
equality is when the run in w starts with 31 and $\eta_{k-1} = -1$. Then $u' =_{BS} a^{4}$,
which is written as $ta^{2}t^{-1}$ in a normal form word, but then the run in u starts with
20(-1) which is not allowed. Thus u is also in $\mathcal{NF}_N$. Without loss of generality
assume $e_k > 0$ so $e_{k-1} = 1$ and $\eta_{k-1} = -1$. Then n must be negative since the run
in u starts with n(-1), and we have a contradiction. □

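Lemma 20 (Uniqueness for $\mathcal{NF}_P \cup \mathcal{NF}_{PX}$). If $w, u \in \mathcal{NF}_P \cup \mathcal{NF}_{PX}$ and $w =_{BS} u$
then w and u are identical.

Proof. If $w, u \in \mathcal{NF}_P \cup \mathcal{NF}_{PX}$ then $w^{-1}$ and $u^{-1}$ are in $\mathcal{NF}_N \cup \mathcal{NF}_{XN}$, so by
Lemma 19, since $w^{-1} =_{BS} u^{-1}$, the words $w^{-1}$ and $u^{-1}$ are identical, and so w and u are
identical. □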

Lemma 21 (Uniqueness). Every group element is represented by a unique normal 
form word. 

Proof. If w and u are two normal form words representing the same group element,
then they have the same t-exponent by Corollary 3.

If w and u have zero t-exponent then they are of the form E, X, NP or XNP.
If neither is NP or XNP then they are identical by Lemma 18. If one is NP or
XNP then let $w = w't^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0}t^{k}$ and $u = u't^{-1}a^{\eta_{l-1}}t^{-1}\cdots t^{-1}a^{\eta_0}t^{l}$
where w', u' evaluate to powers of a and assume without loss of generality that
k > 0 and $k \ge l$. Then
$wu^{-1} = w't^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0}t^{k-l}a^{-\eta_0}t\cdots ta^{-\eta_{l-1}}t(u')^{-1} =_{BS} 1$.
Since k > 0 then $e_0 = \pm 1$ so if we replace w' and u' by the corresponding
powers of a (by pinching $ta^{i}t^{-1}$ subwords) we have a word that does not admit any
pinches, contradicting Britton's Lemma. Thus k = l. Then the words $wt^{-k}$ and
$ut^{-k}$ are equal and in $\mathcal{NF}_N \cup \mathcal{NF}_{XN}$ so by Lemma 19 must be identical, so w and
u are identical.

If w and u have negative t-exponent then they are of the form N, XN, NP or
XNP. If neither is NP or XNP then they are identical by Lemma 19. If one is
NP or XNP then let $w = w't^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0}t^{l}$ and let
$u = u't^{-1}a^{\eta_{p-1}}t^{-1}\cdots t^{-1}a^{\eta_0}t^{q}$ where k > l, p > q, and w', u' evaluate to powers of
a. Assume without loss of generality that l > 0 and $l \ge q$. Then
$wu^{-1} = w't^{-1}a^{e_{k-1}}t^{-1}\cdots t^{-1}a^{e_0}t^{l-q}a^{-\eta_0}t\cdots ta^{-\eta_{p-1}}t(u')^{-1} =_{BS} 1$. Since l > 0
then $e_0 = \pm 1$ so after replacing w' and u' by the corresponding powers of a, we
have a word that does not admit any more pinches, contradicting Britton's Lemma.
Thus l = q. Then the words $wt^{-l}$ and $ut^{-l}$ are equal and in $\mathcal{NF}_N \cup \mathcal{NF}_{XN}$ so by
Lemma 19 must be identical, so w and u are identical.

If w and u have positive t-exponent then they are of the form P, PX, NP or
NPX. If neither is NP or NPX then they are identical by Lemma 20. If one
is NP or NPX then let $w = t^{-k}a^{e_0}t\cdots ta^{e_{l-1}}tw'$ and let $u = t^{-p}a^{\eta_0}t\cdots ta^{\eta_{q-1}}tu'$
where k < l, p < q, and w', u' evaluate to powers of a. Assume without loss of
generality that k > 0 and $k \ge p$. Then
$u^{-1}w = (u')^{-1}t^{-1}a^{-\eta_{q-1}}t^{-1}\cdots t^{-1}a^{-\eta_0}t^{p-k}a^{e_0}t\cdots ta^{e_{l-1}}tw' =_{BS} 1$. Since k > 0
then $e_0 = \pm 1$ so after replacing w' and u' by their corresponding powers of a we
have a word that cannot be pinched, contradicting Britton's Lemma. Thus k = p.
Then the words $t^{k}w$ and $t^{k}u$ are equal and in $\mathcal{NF}_P \cup \mathcal{NF}_{PX}$ so by Lemma 20 must
be identical, so w and u are identical. □

Lemma 22 (Normal forms are geodesic). Each normal form word is a geodesic. 

Proof. Suppose that a word $w \in \mathcal{NF}$ is not geodesic. Choose a geodesic word
$u =_{BS} w$ that is one of the ten types in Lemma 8. By Lemma 12 we can move u
into a word u' of the same length having one run.

If u' is in normal form then since w and u' are both normal form words that
equate to the same group element then w, u' must be identical by Lemma 21.

If u' is not in normal form, it either violates the prefix rules (as in Lemma 13)
or has an adjacent pair of nonzero digits in its run.

If the run in u' has an occurrence of 1(-1) or (-1)1 then u' is not geodesic. If
the run in u' has an occurrence of 11 or (-1)(-1) that is not at the start of an
N-run or the end of a P-run, then by Lemma 11 we can perform a length preserving
rewrite to eliminate it. If this causes u' to have a 1(-1) then u was not geodesic,
and if it causes u' to have a 11 or (-1)(-1) then repeatedly applying Lemma 11
from right to left in an N-run, or left to right in a P-run, we can eliminate all
occurrences of pairs of nonzero digits.

Finally if the start or end is not one of the prefixes in Lemma 13 then either u'
is not geodesic (if the prefix is 20(-1) for example), or is equal to a normal form
word of the same length, which means that the original word w is geodesic. □

5. The main theorem 
Theorem 5.1. The language $\mathcal{NF}$ is a 1-counter language.

Proof. The ten types of normal-form geodesics listed in Definition 8 break up into
five cases. The set $\mathcal{NF}_E$ is a 1-counter language since it is finite. We can describe
a Z-automaton for each of the remaining four cases to accept the remaining nine
types.

Consider the set of normal form words of type X, XN and XNP. The language
$L_1$ of Lemma 14 describes the set of normal form words of these types with at most
two $t^{-1}$ letters in the N-run, and since $L_1$ is finite, it is a regular language.

Let $L_1'$ be the set of words of the form
$\{t^{k}a^{i}t^{-2}at^{-1},\ t^{k}a^{j}t^{-2}a^{-1}t^{-1} \mid k = 1, 2, 3,\ i = 2, \pm 3,\ j = -2, \pm 3\}$.
This is a finite set so is regular, and is the set of X (and XN)
normal form words with three $t^{-1}$'s in the N-run, that corresponds to the prefix
201, 301, 30(-1) and their negatives.

The remaining X, XN and XNP normal form words (with an N-run of 3 or
more $t^{-1}$ letters) are accepted by the automaton on the left of Figure 12. The edge
labeled k stands for a collection of paths labeled by

The union of these three (regular and 1-counter) languages is 1-counter.




Figure 12. Counter automata for normal form X, XN, XNP
words and N, NP$_{\le}$ words with N-run length at least 3.



Next, consider the set of normal form words of type N and NP$_{\le}$. The language
$L_2$ of Lemma 14 describes the set of normal form words of these types with at most
two $t^{-1}$ letters in the N-run, and since $L_2$ is finite, it is a regular language.

Let $L_2'$ be the set of words of the form $\{a^{i}t^{-2}a^{\pm 1}t^{-1} \mid i = 0, \pm 1, \pm 2, \pm 3\}$. This is a
finite set so is regular, and is the set of N (and NP$_{\le}$) normal form words with three
$t^{-1}$'s in the N-run, that corresponds to the prefix 00(±1), 10(±1), 20(±1), 30(±1)
and their negatives.

The remaining N and NP$_{\le}$ normal form words (with an N-run of 3 or more
$t^{-1}$ letters) are accepted by the automaton on the right of Figure 12. The edge labeled
k' stands for a collection of paths labeled by

$a^{i}(t^{-1},-)(t^{-1},-)(t^{-1},-)$, i = 0, ±1, ±2, ±3;
$a^{i}(t^{-1},-)(t^{-1},-)a(t^{-1},-)(t^{-1},-)$, i = 0, ±1, ±2, ±3;
$a^{i}(t^{-1},-)(t^{-1},-)a^{-1}(t^{-1},-)(t^{-1},-)$, i = 0, ±1, ±2, ±3;
$a^{i}(t^{-1},-)a(t^{-1},-)(t^{-1},-)$, i = 0, 1, 2, 3;
$a^{i}(t^{-1},-)a^{-1}(t^{-1},-)(t^{-1},-)$, i = 0, -1, -2, -3.

Next, consider the set of normal form words of type P and NP$_{>}$. The language
$L_3$ of Lemma 14 describes the set of normal form words of these types with at most
two t letters in the P-run, and since $L_3$ is finite, it is a regular language.

Let $L_3'$ be the set of words of the form $\{ta^{\pm 1}t^{2}a^{i} \mid i = 0, \pm 1, \pm 2, \pm 3\}$. This is a
finite set so is regular, and is the set of P (and NP$_{>}$) normal form words with three
t's in the P-run, that corresponds to the suffix (±1)00, (±1)01, (±1)02, (±1)03 and
their negatives.

The remaining P and NP$_{>}$ normal form words (with a P-run of 3 or more t
letters) are accepted by the automaton on the left of Figure 13. The edge labeled
A stands for a collection of paths labeled by

$(t,+)(t,+)(t,+)a^{i}$, i = 0, ±1, ±2, ±3;
$(t,+)(t,+)a(t,+)(t,+)a^{i}$, i = 0, ±1, ±2, ±3;
$(t,+)(t,+)a^{-1}(t,+)(t,+)a^{i}$, i = 0, ±1, ±2, ±3;
$(t,+)(t,+)a(t,+)a^{i}$, i = 0, 1, 2, 3;
$(t,+)(t,+)a^{-1}(t,+)a^{i}$, i = 0, -1, -2, -3.

Figure 13. Counter automata for normal form P, NP$_{>}$ words and
PX, NPX words with P-run length at least 3.



Lastly, consider the set of normal form words of type PX and NPX. The
language $L_4$ of Lemma 14 describes the set of normal form words of these types
with (at most) two t letters in the P-run, and since $L_4$ is finite, it is a regular
language.

Let $L_4'$ be the set of words of the form
$\{tat^{2}a^{i}t^{-k},\ ta^{-1}t^{2}a^{j}t^{-k} \mid k = 1, 2, 3,\ i = 2, \pm 3,\ j = -2, \pm 3\}$.
This is a finite set so is regular, and is the set of PX (and
NPX) normal form words with three t's in the P-run, that corresponds to the
suffix 102, 103, (-1)03 and their negatives.

The remaining PX and NPX normal form words (with a P-run of 3 or more t
letters) are accepted by the automaton on the right of Figure 13. The edge labeled
A' stands for a collection of paths labeled by

$(t,+)(t,+)(t,+)a^{i}$, i = ±2, ±3;

$(t,+)(t,+)a(t,+)(t,+)a^{i}$, i = 2, ±3;

$(t,+)(t,+)a^{-1}(t,+)(t,+)a^{i}$, i = -2, ±3;

$(t,+)(t,+)a(t,+)a^{i}$, i = 2, 3;

$(t,+)(t,+)a^{-1}(t,+)a^{i}$, i = -2, -3.

By LemmaOlthe union of a 1-counter and a regular language is 1-counter so each 
of the ten types is 1-counter, and by Lemma |21 the union of 1-counter languages is 
1-counter. □ 




Corollary 4. The language of normal forms for BS(1, 2) with the standard
generating set is context-free.

6. Full language of geodesics 

In this section we prove that the language of all geodesic words in the standard
generating set is not counter. To prove this we will mimic the proof of Theorem 3.1.
Recall that in that proof we constructed a word $ww^{R}$ on three symbols whose prefix
is square-free and suffix is its reverse, and applied the Swapping Lemma
to obtain a contradiction.

Let w be a word in BS(1, 2) with no $a^{-1}$ letters. Define the t-encoding of w to
be a string of integers $n_1n_2\ldots n_k$ such that $w = t^{n_1}at^{n_2}\cdots at^{n_k}$. If w starts (or
respectively ends) with an a then $n_1 = 0$ (or respectively $n_k = 0$).

As an example, the word

$at^{2}a^{2}ta^{3}t^{4}at^{-9}at^{2}at^{-1} = t^{0}at^{2}at^{0}atat^{0}at^{0}at^{4}at^{-9}at^{2}at^{-1}$

is encoded as 0201004(-9)2(-1). Note that previously our encodings have been of
a-exponents, but this new encoding will be useful for the argument to follow.
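
The t-encoding just defined is straightforward to compute. The sketch below is illustrative only: it assumes a word is stored as a list of (letter, exponent) pairs in which every a-exponent is positive, and the function name `t_encoding` is hypothetical.

```python
def t_encoding(word):
    """Return [n_1, ..., n_k] with word = t^{n_1} a t^{n_2} ... a t^{n_k}.

    `word` is a list of (letter, exponent) pairs, letter in {"a", "t"},
    and every a-exponent must be positive (no a^-1 letters).
    """
    encoding = []
    current = 0  # exponent of the t-block currently being read
    for letter, exponent in word:
        if letter == "t":
            current += exponent
        elif letter == "a":
            if exponent <= 0:
                raise ValueError("the t-encoding needs positive a-exponents")
            # Close the current t-block before the first a of this power ...
            encoding.append(current)
            # ... and separate the remaining a's by empty blocks t^0.
            encoding.extend([0] * (exponent - 1))
            current = 0
        else:
            raise ValueError(f"unknown letter {letter!r}")
    encoding.append(current)  # final block; 0 if the word ends with a
    return encoding


example = [("a", 1), ("t", 2), ("a", 2), ("t", 1), ("a", 3), ("t", 4),
           ("a", 1), ("t", -9), ("a", 1), ("t", 2), ("a", 1), ("t", -1)]
print(t_encoding(example))
```

Applied to the example word above, this returns the list 0, 2, 0, 1, 0, 0, 4, -9, 2, -1.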




Figure 14. A finite state automaton accepting the language L in
the proof of Theorem 6.1.

Theorem 6.1. The language of all geodesic words in BS(1, 2) with respect to the
generating set $\{a^{\pm 1}, t^{\pm 1}\}$ is not counter.

Proof. Suppose that the full language is counter, and call it C. Define L to
be the set of words in $\{a, t^{\pm 1}\}^{*}$ accepted by the finite state automaton in Figure
14. That is, L is the set of PN words whose t-encodings are words of the form
{10, 20, 30}{10, 20, 30}*0{-10, -20, -30}{-10, -20, -30}*.

Since L is regular, the intersection of C and L is counter. Let M be a counter
automaton accepting $C \cap L$, with alphabet $a^{\pm 1}, t^{\pm 1}$. We can construct a new counter
automaton M' which accepts the set of t-encoded words of $C \cap L$ as follows.

The states, start state, accept states and counters are the same as for M. The
new alphabet is {0, ±10, ±20, ±30}. The transitions are defined as follows.

If there is a path labelled by $t^{i}a$ in M from p to q, then add an edge in M'
from p to q labeled by i, and the counters are changed by the same amount as they
were following the path $t^{i}a$ in M. Thus a word is accepted by M if and only if
its encoding is accepted by M'. Since M accepts $C \cap L$, the only subwords of the
form $t^{i}a$ that appear in accepted words are for i = 0, ±10, ±20 or ±30. Let p be
the swapping length for M'.

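Next, take a Thue-Morse word in three symbols, which we choose to be 10, 20, 30,
of length greater than 2p. This word encodes a P word u of some t-exponent 10c.
We wish to find some kind of "reverse" of u, as we did in the proof of Theorem 3.1.
We find a word v to act as the "reverse" by the following procedure.

(1) Write u as $t^{10}a^{e_1}t^{10}a^{e_2}\cdots t^{10}a^{e_{c-1}}t^{10}$ where $e_i = 0, 1$.

(2) Reverse this word.

(3) Replace $a^{1}$ with $a^{0}$ and $a^{0}$ with $a^{1}$ in this word.

(4) Replace $t^{10}$ with $t^{-10}$ in this word to get v.

For example, the Thue-Morse word 10, 20, 30, 10, 30, 20, 10, 20, 30, 20, 10, 30
encodes the word

$u = t^{10}at^{20}at^{30}at^{10}at^{30}at^{20}at^{10}at^{20}at^{30}at^{20}at^{10}at^{30}$.

Step 1: Write u as

$u = |a^{1}|a^{0}|a^{1}|a^{0}|a^{0}|a^{1}|a^{1}|a^{0}|a^{0}|a^{1}|a^{0}|a^{1}|a^{1}|a^{0}|a^{1}|a^{0}|a^{0}|a^{1}|a^{0}|a^{1}|a^{1}|a^{0}|a^{0}|$

where the $t^{10}$ terms are replaced by bars |, to make it easier to read.
Step 2: Reversing this word gives

$|a^{0}|a^{0}|a^{1}|a^{1}|a^{0}|a^{1}|a^{0}|a^{0}|a^{1}|a^{0}|a^{1}|a^{1}|a^{0}|a^{1}|a^{0}|a^{0}|a^{1}|a^{1}|a^{0}|a^{0}|a^{1}|a^{0}|a^{1}|$.

Step 3: Replacing $a^{0}$ by $a^{1}$ and vice versa gives

$|a^{1}|a^{1}|a^{0}|a^{0}|a^{1}|a^{0}|a^{1}|a^{1}|a^{0}|a^{1}|a^{0}|a^{0}|a^{1}|a^{0}|a^{1}|a^{1}|a^{0}|a^{0}|a^{1}|a^{1}|a^{0}|a^{1}|a^{0}|$.

Step 4: Replacing $t^{10}$ by $t^{-10}$ gives

$v = \bar{t}a^{1}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{1}\bar{t}a^{0}\bar{t}a^{1}\bar{t}a^{0}\bar{t}$

where $\bar{t}$ represents $t^{-10}$.

The t-encoding for v is then

(-10)(-10)(-30)(-20)(-10)(-20)(-30)(-20)(-10)(-30)(-10)(-20)(-20).

Note that v does not have to be square-free. Note also that the t-exponent of v is
-10c, where 10c is the t-exponent of u.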
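
The construction of v from u can be reproduced by a short program. The sketch below is an illustration under stated assumptions rather than the author's procedure verbatim: the square-free word on three symbols is taken to be the first-difference sequence of the binary Thue-Morse sequence shifted by 1, relabelled so that it agrees with the example above (2 -> 10, 1 -> 20, 0 -> 30), and the last entry of the encoding of u is treated as the final climb to the top level, which carries no a from u. All function names are hypothetical.

```python
def thue_morse_bits(n):
    """First n terms of the binary Thue-Morse sequence (parity of binary digits)."""
    return [bin(i).count("1") % 2 for i in range(n)]


def square_free_ternary(n):
    """First n terms of the ternary word t[i+1] - t[i] + 1 over {0, 1, 2}."""
    bits = thue_morse_bits(n + 1)
    return [bits[i + 1] - bits[i] + 1 for i in range(n)]


def is_square_free(seq):
    """True if no block of seq is immediately repeated."""
    n = len(seq)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            if seq[start:start + half] == seq[start + half:start + 2 * half]:
                return False
    return True


SYMBOL = {2: 10, 1: 20, 0: 30}  # assumed relabelling, chosen to match the example


def u_encoding(n):
    return [SYMBOL[d] for d in square_free_ternary(n)]


def reverse_encoding(u_enc, step=10):
    """t-encoding of v, obtained from the t-encoding of u by steps (1)-(4)."""
    levels = sum(u_enc) // step
    # Step 1: record at which multiples of `step` the word u has an a.
    heights, h = set(), 0
    for n in u_enc[:-1]:
        h += n
        heights.add(h // step)
    e = [1 if i in heights else 0 for i in range(1, levels)]
    # Steps 2 and 3: reverse the exponent sequence and swap 0 with 1.
    f = [1 - x for x in reversed(e)]
    # Step 4: read off the t-encoding of t^-step a^{f_1} t^-step ... t^-step.
    encoding, run = [], 0
    for x in f:
        run -= step
        if x == 1:
            encoding.append(run)
            run = 0
    encoding.append(run - step)
    return encoding


u = u_encoding(12)
print(u)
print(reverse_encoding(u))
```

For n = 12 this reproduces the Thue-Morse word of the example and the t-encoding of v listed above; `is_square_free(square_free_ternary(n))` can be used to confirm square-freeness for any chosen length.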

Now to understand what motivated us to produce this v from u, consider the
word $w = ua^{2}v = uat^{0}av$. This word is type X. Drawing w in a sheet of the Cayley
graph we see that at every tenth level there is an a letter, either on the part going
up the sheet (the u part) or the part going down (the v part). See the left side of
Figure 15.

We will now show that w is a geodesic. Consider the word w' obtained from w
by commuting all a letters to the right. Since there is exactly one a at every tenth
level of w, we have $w' = t^{10c}a^{2}t^{-10}(at^{-10})^{c-1}$. Then w' is a normal form X word,
since its N-run is of the form 200... with no consecutive non-zero entries. Thus by
Lemma 22 w' is geodesic, and since w' has the same length as w then w is geodesic.
So w is in $C \cap L$, it is accepted by the counter automaton M, and its t-encoding is
accepted by M'.

Applying the Swapping Lemma to the encoding of w, we switch two
adjacent subwords in the first half of w, that is, in the t-encoding of u, which is
square-free.

Figure 15. The word $w = ua^{2}v$ drawn in a sheet of the Cayley graph.

This new string is a t-encoding of some other word in the group, which is an X
word, essentially the same as w except that at some level(s) we see a shift one step
to the right in both sides of the word (viewed in the sheet of the Cayley graph).
See Figure 16.

When we commute a-letters to the right in this word, we will see $t^{-1}a^{2}t^{-1}$ at
some point(s) in the N-run, and thus the swapped word is not a geodesic, so not
in $C \cap L$, and this is a contradiction. □